Empirical Bernstein Boosting
Authors
Abstract
Concentration inequalities that incorporate variance information (such as Bernstein's or Bennett's inequality) are often significantly tighter than counterparts (such as Hoeffding's inequality) that disregard variance. Nevertheless, many state-of-the-art machine learning algorithms for classification, including AdaBoost and support vector machines (SVMs), rely on Hoeffding-type inequalities to justify empirical risk minimization and its variants. This article proposes a novel boosting algorithm based on a recently introduced principle, sample variance penalization, which is motivated by an empirical version of Bernstein's inequality. The framework leads to an efficient algorithm that is as easy to implement as AdaBoost while strictly generalizing it. Experiments on a large number of datasets show significant performance gains over AdaBoost. These results indicate that sample variance penalization could be a viable alternative to empirical risk minimization.
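For background, the empirical Bernstein bound that the abstract alludes to can be stated in the form given by Maurer and Pontil (2009); the constants below come from that general statement and are quoted here only as context, not as a result of this paper. For i.i.d. random variables X_1, ..., X_n taking values in [0, 1] and any delta in (0, 1), with probability at least 1 - delta,

\[
\mathbb{E}[X] \;\le\; \frac{1}{n}\sum_{i=1}^{n} X_i \;+\; \sqrt{\frac{2\,\hat{V}_n \ln(2/\delta)}{n}} \;+\; \frac{7\ln(2/\delta)}{3(n-1)},
\]

where \(\hat{V}_n\) denotes the sample variance of X_1, ..., X_n. Sample variance penalization selects a hypothesis by minimizing an upper bound of this form, i.e. the empirical risk plus a term proportional to \(\sqrt{\hat{V}_n/n}\), rather than the empirical risk alone.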
Similar References
Variance Penalizing AdaBoost
This paper proposes a novel boosting algorithm called VadaBoost which is motivated by recent empirical Bernstein bounds. VadaBoost iteratively minimizes a cost function that balances the sample mean and the sample variance of the exponential loss. Each step of the proposed algorithm minimizes the cost efficiently by providing weighted data to a weak learner rather than requiring a brute force e...
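As a rough illustration of the kind of objective described above, the following Python sketch evaluates a sample-variance-penalized exponential loss. The function name, the trade-off parameter lam, and the exact form of the penalty are illustrative assumptions for this sketch; they are not the specific cost function that VadaBoost optimizes at each step.

    # Illustrative sketch: exponential loss penalized by its sample variance.
    # This is an assumption-based example, not the VadaBoost update itself.
    import numpy as np

    def svp_exponential_cost(margins, lam=0.1):
        """margins: array of y_i * f(x_i) for the current ensemble f."""
        losses = np.exp(-margins)      # exponential loss per training example
        mean = losses.mean()           # empirical risk term
        var = losses.var(ddof=1)       # sample variance term
        n = len(losses)
        return mean + lam * np.sqrt(var / n)

    # Example usage on margins from a small training set.
    margins = np.array([0.8, 1.2, -0.3, 0.5, 2.0, -0.1])
    print(svp_exponential_cost(margins, lam=0.25))

Setting lam to zero recovers plain empirical risk minimization of the exponential loss, which is the sense in which variance-penalized boosting generalizes AdaBoost-style objectives.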
Large Relative Margin and Applications
Pannagadatta K. Shivaswamy. Over the last decade or so, machine learning algorithms such as support vector machines and boosting have become extremely popular. The core idea in these and other related algorithms is the notion of a large margin. Simply put, the idea is to geometrically separate two classes with a large separation between them; such a separat...
PAC-Bayes-Empirical-Bernstein Inequality
We present a PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on a combination of the PAC-Bayesian bounding technique with an empirical Bernstein bound. We show that when the empirical variance is significantly smaller than the empirical loss, the PAC-Bayes-Empirical-Bernstein inequality is significantly tighter than the PAC-Bayes-kl inequality of Seeger (2002), and otherwise it ...
Testing Independence Based on Bernstein Empirical Copula and Copula Density
In this paper we provide three nonparametric tests of independence between continuous random variables based on the Bernstein copula and copula density. The first test is based on a Cramér-von Mises functional of the Bernstein empirical copula. The other two tests are based on the Bernstein density copula and use Cramér-von Mises and Kullback-Leibler divergence-type statistics, respectively. Furthermore...
Empirical Bernstein Inequalities for U-Statistics
We present original empirical Bernstein inequalities for U-statistics with bounded symmetric kernels q. They are expressed in terms of empirical estimates of either the variance of q or the conditional variance that appears in the Bernstein-type inequality for U-statistics derived by Arcones [2]. Our result subsumes other existing empirical Bernstein inequalities, as it reduces to them when ...